Executive Summary

The main aim of this study is to predict average daily electricity demand to enable NESO to ensure a reliable supply. Forecasting is performed using an ordinary least squares linear regression model, selected by identifying key temporal and weather-based predictors and by considering metrics such as \(R^2\) and AIC. The model is then evaluated for its practicality and predictive ability using cross validation. We ultimately conclude that the model predicts average daily demand accurately while remaining simple and practical.

Data Analysis and Exploration

Prior to fitting any models, we examine the dataset used in our analysis. We consider the following predictor variables for our linear model:

  • Weather-based predictors: the temperature measures temp, TO and TE (with TE incorporating information from both the current and previous day), solar and wind

  • Temporal predictors: wdaynumber (day of the week), month, year and DSN

The figures and observations from our data exploration are recorded below.

Model Selection and Evaluation

We fit an ordinary least-squares linear regression model of the form \[\mathbf{y} = X \boldsymbol{\beta} + \boldsymbol{\epsilon}\] assuming \(\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2 I)\).
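As a concrete illustration of the OLS fit (a minimal pure-Python sketch on a tiny made-up dataset, not our actual demand data), the estimate \(\hat{\boldsymbol{\beta}}\) solves the normal equations \(X^\top X \hat{\boldsymbol{\beta}} = X^\top \mathbf{y}\):

```python
# Minimal OLS sketch: solve the normal equations X'X beta = X'y
# for a toy dataset (not the real demand data).

def ols_fit(X, y):
    """Return beta solving (X'X) beta = X'y via Gaussian elimination."""
    p = len(X[0])
    # Form the augmented normal-equation system [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Toy data generated from y = 2 + 3x (no noise), so OLS recovers beta exactly.
X = [[1.0, x] for x in [0.0, 1.0, 2.0, 3.0]]
y = [2.0 + 3.0 * x for _, x in X]
beta = ols_fit(X, y)
print(beta)  # close to [2.0, 3.0]
```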

Metrics

Variable significance is assessed through hypothesis tests on the model parameters, where a high p-value indicates no statistically significant effect. The \(R^2\) metric measures the proportion of variance explained. Models with a higher adjusted \(R^2\) are generally preferred, as this indicates that the model explains a greater proportion of the variability in the data. Predictive accuracy is assessed through the root mean squared error (RMSE), which measures the average magnitude of the residuals, with lower values indicating that the model’s predictions are closer to the observed data. The AIC balances model fit against complexity by penalising the number of parameters in the model. A model with a lower AIC is preferred, as it explains the data well without including unnecessary parameters.
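These quantities can be computed directly from the observed and fitted values. A minimal sketch (with illustrative numbers, not the demand data; the AIC uses the Gaussian log-likelihood form):

```python
import math

def regression_metrics(y, y_hat, n_params):
    """R^2, adjusted R^2, RMSE and (Gaussian) AIC from observed vs fitted values."""
    n = len(y)
    resid = [yi - fi for yi, fi in zip(y, y_hat)]
    sse = sum(e * e for e in resid)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    rmse = math.sqrt(sse / n)
    # AIC for a Gaussian likelihood with variance estimated by MLE (sse/n);
    # the +1 in the parameter count accounts for the estimated variance.
    aic = n * math.log(sse / n) + n * (1 + math.log(2 * math.pi)) + 2 * (n_params + 1)
    return r2, adj_r2, rmse, aic

y = [10.0, 12.0, 14.0, 13.0, 15.0]
y_hat = [10.5, 11.5, 13.8, 13.4, 14.8]
r2, adj_r2, rmse, aic = regression_metrics(y, y_hat, n_params=2)
print(round(r2, 3), round(rmse, 3))  # ≈ 0.95 0.385
```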

Model selection

Based on the exploratory analysis, the final model was selected to capture the key predictors and account for non-linearities and interactions while balancing simplicity:

\[\begin{eqnarray*} M_F: y_i = \beta_0 + \beta_1 \text{solar}_i + \beta_2 \text{wdaynumber}_i + \beta_3 \text{month}_i + \beta_4 \text{year}_i + \beta_5 \text{DSN}^2_i + \beta_6 \text{TE}_i + \beta_7 \, \text{DSN}^2\text{:month}_i + \epsilon_i \end{eqnarray*}\] where \(\epsilon_i \overset{\mathrm{iid}}\sim \mathcal{N}(0, \sigma^2)\).

The TE variable was selected as the temperature measure because it incorporates information from both the current and previous day, capturing lagged effects in demand. Compared with the alternatives, the model using TE produced a higher \(R^2\), lower RMSE and lower AIC, indicating better overall fit and predictive performance (Table 1). The DSN variable exhibited a quadratic relationship with demand, so \(\text{DSN}^2\) was included. Month and Year were included as factor variables to account for seasonal and annual variation. A distinct dip in demand in December motivated the inclusion of an interaction between \(\text{DSN}^2\) and Month, allowing the model to capture monthly variation in the quadratic effect and substantially improving the AIC (a decrease of 1,846.95).
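Each factor term and the \(\text{DSN}^2\):month interaction corresponds to a block of dummy-coded columns in the design matrix. A hypothetical sketch of how a single row could be encoded (the column layout is assumed for illustration, and year dummies are omitted for brevity; this is not our actual code):

```python
# Build one design-matrix row for M_F: intercept, solar, weekday dummies,
# month dummies, DSN^2, TE, and the DSN^2:month interaction columns.
# Dummy coding drops the first level of each factor as the baseline.

WEEKDAYS = list(range(7))     # 0..6, baseline level 0
MONTHS = list(range(1, 13))   # 1..12, baseline month 1

def encode_row(solar, weekday, month, dsn, te):
    row = [1.0, float(solar)]                                     # intercept, solar
    row += [1.0 if weekday == w else 0.0 for w in WEEKDAYS[1:]]   # weekday dummies
    row += [1.0 if month == m else 0.0 for m in MONTHS[1:]]       # month dummies
    row += [dsn ** 2, te]                                         # DSN^2 and TE
    row += [dsn ** 2 if month == m else 0.0 for m in MONTHS[1:]]  # DSN^2:month
    return row

row = encode_row(solar=0, weekday=3, month=12, dsn=40, te=5.2)
print(len(row))  # 2 + 6 + 11 + 2 + 11 = 32 columns
```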
Table 1: Model Performance Comparison
Model \(R^2\) Adjusted \(R^2\) RMSE AIC
Model with temp 0.8585733 0.8569565 1533.105 62062.24
Model with TO 0.8478952 0.8461564 1589.928 62319.91
Model with TE 0.8696681 0.8681782 1471.741 61773.03

The remaining variables were included due to trends seen in the exploratory analysis and because they are statistically significant, mostly with \(p<0.01\).

Table 2: Regression Results
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33526.962 321.157 104.394 0.000
TE -529.501 10.654 -49.700 0.000
solar_sarah -15846.613 1141.001 -13.888 0.000
factor(WeekdayNum)1 5596.396 93.053 60.142 0.000
factor(WeekdayNum)2 6324.780 93.121 67.920 0.000
factor(WeekdayNum)3 6360.136 93.097 68.317 0.000
factor(WeekdayNum)4 6279.360 93.250 67.339 0.000
factor(WeekdayNum)5 5464.630 93.218 58.622 0.000
factor(WeekdayNum)6 1169.326 93.079 12.563 0.000
I(DSN^2) 0.660 0.041 16.114 0.000
factor(Month)2 4825.442 455.153 10.602 0.000
factor(Month)3 5306.641 496.149 10.696 0.000
factor(Month)11 3286.746 266.867 12.316 0.000
factor(Month)12 9149.686 289.650 31.589 0.000
factor(Year)1992 -34.595 226.488 -0.153 0.879
factor(Year)1993 -26.648 226.925 -0.117 0.907
factor(Year)1994 453.790 227.098 1.998 0.046
factor(Year)1995 1502.790 226.743 6.628 0.000
factor(Year)1996 2365.578 227.285 10.408 0.000
factor(Year)1997 2804.758 227.065 12.352 0.000
factor(Year)1998 3134.583 227.123 13.801 0.000
factor(Year)1999 3779.208 226.878 16.657 0.000
factor(Year)2000 4435.593 226.786 19.558 0.000
factor(Year)2001 5206.297 226.702 22.965 0.000
factor(Year)2002 5574.093 227.401 24.512 0.000
factor(Year)2003 6347.567 226.968 27.967 0.000
factor(Year)2004 6576.452 226.678 29.012 0.000
factor(Year)2005 5704.218 226.901 25.140 0.000
factor(Year)2006 5376.686 226.954 23.691 0.000
factor(Year)2007 4761.861 227.234 20.956 0.000
factor(Year)2008 4100.558 226.620 18.094 0.000
factor(Year)2009 2600.179 226.727 11.468 0.000
factor(Year)2010 2864.807 228.077 12.561 0.000
factor(Year)2011 2258.601 227.157 9.943 0.000
factor(Year)2012 1629.380 226.662 7.189 0.000
factor(Year)2013 1119.108 227.056 4.929 0.000
factor(Year)2014 488.905 227.196 2.152 0.031
I(DSN^2):factor(Month)2 -0.708 0.053 -13.357 0.000
I(DSN^2):factor(Month)3 -0.728 0.047 -15.418 0.000
I(DSN^2):factor(Month)11 0.502 0.225 2.231 0.026
I(DSN^2):factor(Month)12 -3.808 0.079 -48.028 0.000

Model Evaluation

We examine the residuals to check whether the underlying model assumptions are satisfied, ensuring the model provides a valid fit to the data.

The Residuals vs Fitted plot shows that the variance remains fairly constant, supporting the homoscedasticity assumption. The Q-Q plot shows a slight deviation from normality, with an S-shaped pattern suggesting mild skewness. However, given the large sample size, the central limit theorem ensures the sampling distribution of the estimators remains approximately normal. The Scale-Location plot shows minimal heteroscedasticity, with only a slight curve in the line; this is not concerning given the large sample size and the model’s predictive focus. Finally, the Residuals vs Leverage plot indicates no problematic points, as none exhibit high leverage or a large Cook’s distance, suggesting no influential outliers.

Cross Validation

Methodology

To assess model generalizability, we use expanding window cross-validation. Since our model includes year as a factor, which would be updated by NESO annually based on their estimated year effect, we alter our cross validation method as follows:

Firstly, we fit the model on an initial training set covering 1991 up to 2000. The year effect is then removed from the trained model, which we use to compute the ‘yearless predictions’ for 2001. To mimic adding a year effect back in, we assume a known year effect, using the coefficient estimated from the full dataset. This is added to the ‘yearless predictions’ to give our final predictions, which we compare with the observed data for 2001 using the following metrics:

  • Squared error: \(\text{SE} = (y - \hat{y}_F)^2\)

  • Interval score: \(\text{IS}(\alpha) = U_F - L_F + \frac{2}{\alpha} (L_F - y) \mathcal{1}\{y \leq L_F\} + \frac{2}{\alpha} (y - U_F) \mathcal{1}\{y \geq U_F\}\) where \(L_F\) and \(U_F\) denote the lower and upper bounds of the prediction interval with coverage probability \(1 - \alpha\).

  • Dawid-Sebastiani: \(\text{DS} = \frac{(y - \hat{y}_F)^2}{\sigma^2_F} + \log(\sigma^2_F)\)
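The three scores above can be implemented directly; a minimal sketch with made-up values (not our actual evaluation code):

```python
import math

def squared_error(y, y_hat):
    return (y - y_hat) ** 2

def interval_score(y, lower, upper, alpha):
    """Interval score for a (1 - alpha) prediction interval [lower, upper]."""
    score = upper - lower
    if y < lower:
        score += (2.0 / alpha) * (lower - y)   # penalty for falling below
    if y > upper:
        score += (2.0 / alpha) * (y - upper)   # penalty for falling above
    return score

def dawid_sebastiani(y, y_hat, var):
    """Dawid-Sebastiani score given predictive mean y_hat and variance var."""
    return (y - y_hat) ** 2 / var + math.log(var)

# Observation inside the interval: only the interval width is scored.
print(interval_score(25000.0, 24000.0, 27000.0, alpha=0.05))  # 3000.0
# Observation 1000 below the interval incurs an extra (2/0.05)*1000 penalty.
print(interval_score(23000.0, 24000.0, 27000.0, alpha=0.05))  # 43000.0
```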

We then repeat this process, expanding the training set by one year and moving the test set forward one year.
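The expanding-window loop can be sketched as follows. The model components are deliberately trivial stand-ins (a model that predicts the training mean, on synthetic data), and for simplicity the known year effect is removed from the training data rather than from the fitted coefficients:

```python
import math

# Expanding-window cross-validation skeleton: train on 1991..(test_year - 1),
# test on test_year, then grow the training window by one year.

def fit(train):
    return sum(train) / len(train)            # stand-in "model": training mean

def predict_without_year_effect(model, _row):
    return model                              # stand-in "yearless" prediction

def rmse(obs, preds):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(obs, preds)) / len(obs))

# Synthetic demand-like data: a base level plus a known per-year shift.
year_effect = {y: 100.0 * (y - 1991) for y in range(1991, 2006)}
data_by_year = {y: [30000.0 + year_effect[y] + d for d in (-50.0, 0.0, 50.0)]
                for y in range(1991, 2006)}

scores = []
for test_year in range(2001, 2006):
    train = [x - year_effect[y]               # remove year effect before fitting
             for y in range(1991, test_year) for x in data_by_year[y]]
    model = fit(train)
    # 'Yearless' predictions plus the assumed known year effect.
    preds = [predict_without_year_effect(model, row) + year_effect[test_year]
             for row in data_by_year[test_year]]
    scores.append(rmse(data_by_year[test_year], preds))
print([round(s, 1) for s in scores])  # → [40.8, 40.8, 40.8, 40.8, 40.8]
```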

Predictive Ability and Simplicity

Table 3 presents the predictive scores from the cross validation for our final model, a basic model,

\[ M_B: y_i = \beta_0 + \beta_1 \text{solar}_i + \beta_2 \text{wdaynumber}_i + \beta_3 \text{month}_i + \beta_4 \text{year}_i + \beta_5 \text{wind}_i + \beta_6 \text{temp}_i + \epsilon_i \]

and a more complex model,

\[ M_C: y_i = \beta_0 + \beta_1 \text{solar}_i + \beta_2 \text{wdaynumber}_i + \beta_3 \text{month}_i + \beta_4 \text{year}_i + \beta_5 \text{DSN}^2_i + \beta_6 \text{TE}_i + \\ \beta_7 \text{TE:solar}_i + \beta_8 \text{TE:month}_i + \beta_9 \text{solar:month}_i + \beta_{10} \text{wdaynumber:month}_i + \beta_{11} \text{DSN}^2\text{:month}_i + \epsilon_i \] where \(\epsilon_i \overset{\mathrm{iid}}\sim \mathcal{N}(0, \sigma^2)\).

Table 3: Predictive Scores for Final, Baseline & Complex Model
Model RMSE Mean Dawid Sebastiani Score Mean Interval Score
\(M_F\) 1462.544 15.58537 9085.693
\(M_{B}\) 1872.626 16.09661 12492.786
\(M_{C}\) 1449.466 15.56648 8948.997

Each predictive score is lower for \(M_F\) than for \(M_B\), indicating more accurate predictions and better-calibrated uncertainty around them. However, the more complex model \(M_C\) achieves only marginally better scores, suggesting that the simpler, more practical model is preferable.

To evaluate potential overfitting, we compare the RMSE based on the full dataset with the RMSE from the Cross Validation:

Table 4: How the model performs on trained data vs new data
Fitted RMSE Cross Validation RMSE
1471.741 1462.544

The final model shows no sign of overfitting: the fitted RMSE is in fact greater than the cross validation RMSE. This is unusual, but could be explained by the averaging over the 14 test sets. Note also that the difference is minor relative to the scale of the data, indicating that the model performs consistently on both training and new data.


Limitations

We now consider the limitations of our method and model.

Firstly, by assuming the known year effect from the full model, we leak information about the test year into the predictions. We nevertheless choose this method for two reasons: it minimises disruption to the chronological structure of the data, and expanding the training set at each step incorporates more historical data, helping the model capture long-term and recurring monthly and weekly trends more effectively.

Secondly, consider the observed vs predicted values from our cross validation in Figure 9. While the points lie scattered around the \(y=x\) line, indicating accurate predictions, a few points lie above the line. However, these occur at lower demand levels, and since NESO prioritises accuracy at higher demand levels, where shortfalls are more likely to occur, the model remains well-suited for its purpose.